Abstract. We present an algorithm for the identification of transient noise
artifacts (glitches) in cross-correlation searches for long gravitational-wave transients
lasting seconds to weeks. The algorithm utilizes the auto-power in each detector 
as a discriminator between well-behaved stationary noise (possibly including a 
gravitational-wave signal) and non-stationary noise transients. We test the algorithm 
with both Monte Carlo noise and time-shifted data from the LIGO S5 science run and 
find that it removes a significant fraction of glitches while keeping the vast majority 
(99.6%) of the data. We show that this cleaned data can be used to observe GW signals 
at a significantly lower amplitude than can otherwise be achieved. Using an accretion 
disk instability signal model, we estimate that the algorithm is accidentally triggered 
at a rate of less than $10^{-5}$% by realistic signals, and less than 3% even for exceptionally
loud signals. We conclude that the algorithm is a safe and effective method for cleaning 
the cross-correlation data used in searches for long gravitational-wave transients. 
1. Introduction



Our aim is to detect long-lasting gravitational-wave (GW) transients (lasting seconds 
to weeks) in the presence of "glitches" : non-stationary noise artifacts that contaminate 
the otherwise approximately Gaussian strain noise in GW interferometers. We focus our 
attention on the cross-correlation method of [1], though it may be possible to extend
this formalism to other search algorithms as well, a topic of ongoing research. Possible
sources of long GW transients include convection in proto-neutron stars, rotational
instabilities associated with nascent neutron stars, instabilities in the disks of accreting
systems [13, 16], neutron star glitches, soft gamma repeaters / anomalous X-ray binaries
and dynamically formed black hole binaries.

Glitches can arise from environmental contamination such as mechanical vibrations, 
electromagnetic disturbances, circuit breaker trips, power shorts and asymmetric 
photodiode response [30]. While some glitches can be identified and removed by
comparing GW strain channels with environmental and sub-system monitoring channels,
many remain after the first stages of data cleaning (see, e.g., [30, 31, 32, 33, 34, 35, 36,
37]). These remaining glitches require special attention for two reasons. First, a high
glitch rate can diminish the sensitivity of a search by raising the threshold required
for an event to be statistically significant‡. Indeed, below we shall show a realistic
example wherein the required signal power for a p = 0.1% false alarm probability event
drops two-fold when we use our algorithm to remove glitchy segments from GW data. 
This level of improvement is not achievable with the application of existing data-quality 
flags. Second, robust glitch identification methods can improve our confidence in a GW 
candidate if it does not resemble non-stationary noise. 

We describe an algorithm to check the consistency of the auto-power from two 
terrestrial GW detectors to identify glitches in searches using the cross-power statistic 
described in [1]. (Throughout, we use the expressions "auto-power" and "cross-power"
instead of "power spectrum," which can refer to either.) We demonstrate the
ability of the algorithm to improve the sensitivity of targeted searches by cleaning real 
interferometer data to a level approaching optimally well-behaved Gaussian noise. 

This work builds on [33], which described how environmental monitoring channels
can be used to identify long-lasting noise transients. However, it differs in two respects:
first, we utilize only GW strain channels, and second, we are interested in recovering
long-lasting GW signals in the presence of what are sometimes very short bursts of noise.

‡ The astute reader may wonder how the present concern about glitches should be squared with
the finding in [1] that the SNR distributions for time-shifted and Monte Carlo data "are in qualitative
agreement." Do we really need to worry about glitches in searches for long GW transients? The
answer is yes. The results presented in [1] compared the standard deviation and approximate shape of
distributions of pixel SNR for Monte Carlo and time-shifted data. While this comparison showed that the
distributions are similar, our present analysis focuses on the high-SNR tail of the distribution of clusters
of pixels. Since glitches tend to produce clusters of pixels of non-Gaussian noise, their importance is
magnified when we study the distribution of cluster SNR.
It also differs from [38, 39] and other consistency-check algorithms that the authors are
aware of because we are not checking the consistency of GW triggers, but rather we
are checking the consistency of data segments, many of which will together constitute
a GW trigger. This is born of necessity from our focus on long transients. We shall see
in Section 4 that by flagging individual segments as glitchy, we are able in principle to
observe a GW event temporarily disturbed by non-stationary noise.

To illustrate our glitch identification algorithm, we use Monte Carlo and time-shifted
data from the 4 km LIGO H1 and L1 interferometers [40] in Hanford, WA
and Livingston, LA, respectively. Time-shifting one strain time series with respect to
another by an amount greater than the GW travel time between interferometers removes
astrophysical signals while preserving non-Gaussian noise artifacts that are otherwise
difficult to simulate. Our Monte Carlo assumes Gaussian noise with an initial LIGO
design sensitivity, and our time-shifted data are from the Nov. 5, 2005 - Sep. 30, 2007 S5
science run (see, e.g., [41, 42]). During S5, the LIGO interferometers achieved a strain
sensitivity of $\sim 3 \times 10^{-23}\,\mathrm{Hz}^{-1/2}$ in the most sensitive band around 100-200 Hz. We
utilize a few days of accumulated data from GPS = 816065659-819039020. By comparing
how the glitch identification algorithm performs for Monte Carlo and time-shifted
results, we can measure how close we can get to ideal Gaussian noise by cleaning
non-Gaussian noise. While we use the LIGO H1 and L1 detectors for illustrative purposes, we
expect that these techniques can be extended to additional pairs of detectors including
interferometers such as Virgo [43, 44, 45, 46], LCGT [47, 48] and GEO [49, 50, 51, 52].



The outline for the rest of this paper is as follows. In Section 2 we summarize the
cross-power-based analysis framework from [1]. In Section 3 we develop an auto-power
difference statistic that can be used to evaluate whether the auto-power in a pair of
detectors is consistent with noise plus a GW signal. We analyze the behavior of this
statistic for stationary noise, signals, and glitches. In Section 4 we present a glitch
identification algorithm based on the auto-power difference statistic and demonstrate
its ability to clean time-shifted LIGO data. In Section 5 we introduce an accretion
disk instability waveform, which we use in Section 6 to investigate the safeness of our
algorithm, i.e., the probability that it falsely identifies a signal as glitch-like. In Section 7
we investigate the complementarity of our algorithm to data quality flags based on
instrumental and environmental noise artifacts. Section 8 contains concluding remarks.



2. Formalism 



Our starting point is [1], which is described in greater detail in Appendix A. We use
the cross-correlation of two or more spatially separated interferometers to construct a
statistic $Y(t;f)$, which is an unbiased estimator for the GW power $H(t;f)$ between
times $t$ and $t + \delta t$ in some frequency bin between $f$ and $f + \delta f$. $H(t;f)$ is defined in
terms of the GW field Fourier coefficients, $h_A(f)$ (see Appendix A):

$$H(t;f) = \mathrm{Tr}\left[H_{AA'}(t;f)\right] = \mathrm{Tr}\left[\langle h_A(t;f)\, h_{A'}(t;f) \rangle\right] \qquad (1)$$
Here the brackets $\langle \ldots \rangle$ denote the expectation value of the enclosed quantity. The
semicolon emphasizes that $t$ refers to the beginning of a data segment of length $\delta t$ and
not to the many sampling times associated with each segment. It is important that the
noise in the two interferometers is uncorrelated, which is easily achieved for spatially
separated interferometers.

The set of $Y(t;f)$ can be represented as an ft-map (spectrogram). The same is
true of $\sigma(t;f)$, an estimator for the uncertainty associated with $Y(t;f)$. GW candidates
are identified as clusters of high $\mathrm{SNR} = Y/\sigma$ pixels. The significance of a cluster $\Gamma$
can be estimated by calculating the total SNR for the entire cluster, denoted $\mathrm{SNR}(\Gamma)$,
and comparing it to the distribution of $\mathrm{SNR}(\Gamma)$ obtained with time-shifted data.

For sufficiently long signals, the effect of non-stationary noise is averaged away and
$\mathrm{SNR}(\Gamma)$ becomes Gaussian distributed by the central limit theorem. This limiting case
is the stochastic radiometer, a technique for mapping the GW sky with two or more
spatially separated interferometers [53, 54, 55]. Here, however, we study (relatively)
shorter time scales where glitches play a role in our ability to determine the significance
of an event. The question we aim to investigate in the rest of the paper is: how can we
discriminate between large values of $\mathrm{SNR}(\Gamma)$ due to a GW signal and large values due
to glitches?

Our glitch identification algorithm will utilize cross-power $C_{IJ}(t;f)$ and auto-power
$P_I(t;f)$, which are related to $H(t;f)$ by the "pair efficiency" $\epsilon_{IJ}(t;\hat\Omega)$:

$$\langle C_{IJ}(t;f) \rangle = \epsilon_{IJ}(t;\hat\Omega)\, H(t;f)\, e^{-2\pi i f \tau_{IJ}} \qquad (2)$$

$$\langle P_I(t;f) \rangle = \epsilon_{II}(t;\hat\Omega)\, H(t;f) + N_I(t;f) \qquad (3)$$

Here $\tau_{IJ}$ is the direction-dependent time delay between detector $I$ and detector $J$ and
$N_I(t;f)$ is the noise power in detector $I$. For additional details, including an expression
for $\epsilon$, see Appendix A. It is also useful to define $P'_I(t;f)$, the average power in the $2n$ segments
neighboring $t$:

$$P'_I(t_0;f) = \frac{1}{2n}\left[\sum_{t=t_0-n\delta t}^{t_0+n\delta t} P_I(t;f) - P_I(t_0;f)\right] \qquad (4)$$

In this analysis we use n = 4 neighboring segments on each side. 
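The neighbor average can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's implementation: the function name `neighbor_average`, the `(n_freq, n_time)` array layout, and the treatment of edge segments (averaging whichever neighbors exist) are all assumptions made for the example.

```python
import numpy as np

def neighbor_average(power, n=4):
    """Average auto-power over the segments adjacent to each segment.

    power: array of shape (n_freq, n_time); each column is one data segment.
    n:     number of neighboring segments used on each side.
    The segment itself is excluded from its own average. Near the edges of
    the map fewer than 2n neighbors exist; this sketch simply averages the
    ones that are available (an assumption, not taken from the text).
    """
    power = np.asarray(power, dtype=float)
    n_time = power.shape[1]
    out = np.empty_like(power)
    for k in range(n_time):
        lo, hi = max(0, k - n), min(n_time, k + n + 1)
        neighbors = [j for j in range(lo, hi) if j != k]
        out[:, k] = power[:, neighbors].mean(axis=1)
    return out
```

Because the segment under test is excluded, a single loud segment leaves its own neighbor average untouched but raises the averages of the n segments on either side of it, which is the behavior exploited in the next section.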
3. An auto-power difference statistic 

Since the noise and the signal are uncorrelated, the expectation value of $P_I(t;f)$ is
given by Eq. 3. If we assume that $N_I(t;f)$ can be estimated by looking at neighboring
segments of noise (i.e., the noise is stationary), then we can construct an estimator for
the observed auto-power in detector $I$ due to GWs:

$$\frac{P_I(t;f) - P'_I(t;f)}{\epsilon_{II}} \qquad (5)$$

We assume that there is no (or comparatively little) signal present in the same frequency
bin during these neighboring times§ so that

$$\langle P'_I(t;f) \rangle \approx N_I(t;f). \qquad (6)$$

Similarly, the GW auto-power in detector $J$ is:

$$\frac{P_J(t;f) - P'_J(t;f)}{\epsilon_{JJ}} \qquad (7)$$

We now construct a quantity, which represents the GW auto-power difference
between detectors $I$ and $J$:

$$\Xi(t;f) = \frac{P_I(t;f) - P'_I(t;f)}{\epsilon_{II}} - \frac{P_J(t;f) - P'_J(t;f)}{\epsilon_{JJ}} \qquad (8)$$

By construction, we expect that $\langle \Xi(t;f) \rangle = 0$ for well-behaved noise plus a signal that
is well-modeled by the pair efficiencies $\epsilon_{II}$, $\epsilon_{JJ}$. We note that $|\Xi(t;f)|$ is invariant under
$I \leftrightarrow J$.

It is desirable to normalize $\Xi(t;f)$ such that the new quantity is unitless with a
near-unity variance. The variance of $\Xi(t;f)$ is given by:

$$\sigma_\Xi^2(t;f) = \frac{\langle P_I(t;f)\rangle^2 + \langle P'_I(t;f)\rangle^2}{\epsilon_{II}^2} + \frac{\langle P_J(t;f)\rangle^2 + \langle P'_J(t;f)\rangle^2}{\epsilon_{JJ}^2} - \frac{2\epsilon_{IJ}^2}{\epsilon_{II}\,\epsilon_{JJ}} \langle Y(t;f)\rangle^2. \qquad (9)$$

This motivates a normalization factor denoted $\sigma_\Xi(t;f)$, which we choose to be

^^fM^.lMM.ii^y) 1 (10) 

We shall see below that this normalization provides an effective means of creating a
unitless signal-to-noise ratio $\mathrm{SNR}_\Xi = \Xi/\sigma_\Xi$, which we can use to determine if the
auto-power in two interferometers is consistent with a GW signal plus well-behaved
(stationary) noise. The $(t;f)$ dependence of $\mathrm{SNR}_\Xi(t;f)$ is implicit. Note that $\mathrm{SNR}_\Xi$ is
not equivalent to the cross-correlation statistic $\mathrm{SNR} = Y/\sigma_Y$.

By considering Eqs. 8 and 9, it is apparent that the qualitative behavior of $\mathrm{SNR}_\Xi$
is different for signals, $\langle Y(t;f) \rangle > 0$, and glitches, $P_I(t;f) \gg P_J(t;f)$. Loud glitches
in detectors $I$, $J$ cause $\mathrm{SNR}_\Xi \approx \pm 1$ surrounded by $\mathrm{SNR}_\Xi \approx \mp 1$ (see, e.g., Fig. 1
top row). Neighboring segments are affected due to our noise estimation technique,
which averages adjacent segments in time (see Appendix A). Loud GW signals, on the
other hand, cause $\mathrm{SNR}_\Xi \approx 0$ surrounded by $\mathrm{SNR}_\Xi \approx 0$ with larger fluctuations in the
neighboring segments. This qualitative description of $\mathrm{SNR}_\Xi$ in the presence of a
GW signal is demonstrated in Fig. 2 as well as the bottom row of ft-maps in Fig. 1.
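The limiting behavior described above can be checked numerically with a small sketch. The function `snr_xi` and its argument names are hypothetical. The normalization it uses, the quadrature sum of the measured and neighbor-averaged auto-powers in each detector, is an assumption adopted only for illustration: it has the property that a loud single-detector glitch yields $|\mathrm{SNR}_\Xi| \approx 1$, but it should not be read as the normalization chosen in the text.

```python
import numpy as np

def snr_xi(p_i, pp_i, p_j, pp_j, eps_ii=1.0, eps_jj=1.0):
    """Auto-power difference divided by an assumed normalization.

    p_i, p_j:   auto-power in detectors I and J for the segment under test.
    pp_i, pp_j: neighbor-averaged auto-power in the same frequency bin.
    eps_ii, eps_jj: pair efficiencies (taken as plain numbers here).
    """
    p_i, pp_i, p_j, pp_j = (np.asarray(a, dtype=float)
                            for a in (p_i, pp_i, p_j, pp_j))
    # Auto-power difference between detectors I and J.
    xi = (p_i - pp_i) / eps_ii - (p_j - pp_j) / eps_jj
    # Assumed normalization (illustrative stand-in, see lead-in).
    sigma = np.sqrt((p_i**2 + pp_i**2) / eps_ii**2
                    + (p_j**2 + pp_j**2) / eps_jj**2)
    return xi / sigma

# Loud glitch in detector I only: close to +1.
glitch = snr_xi(1e6, 1.0, 1.0, 1.0)
# A neighbor of that glitch (its neighbor average is inflated): close to -1.
neighbor = snr_xi(1.0, 1e6 / 8, 1.0, 1.0)
# Equal excess power in both detectors, as for a well-modeled signal: 0.
signal = snr_xi(50.0, 1.0, 50.0, 1.0)
```

Swapping the roles of the two detectors flips the sign of the result, so its magnitude is unchanged under the exchange of I and J.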

§ This approximation works best for narrowband signals whose frequency varies significantly with time,
as is the case for the examples shown here (see, e.g., Fig. 1). When the approximation is poor, e.g.,
for a monochromatic signal, then $\langle P'_I(t;f) \rangle$ may include a significant GW component as well, though
$\Xi(t;f)$ (defined in Eq. 8) will behave much the same way as it is still the case that $\langle \Xi(t;f) \rangle = 0$ by
construction.
Figure 1: Ft-maps of time-shifted LIGO S5 data. The left-hand column shows cross-power
SNR while the right-hand column shows $\mathrm{SNR}_\Xi$ for the same data. Top row:
a likely glitch. Middle row: nearly stationary noise. Bottom row: stationary noise
plus a simulated circularly-polarized accretion disk instability waveform (d = 5 Mpc).
4. Glitch identification

Having introduced the auto-power difference statistic $\mathrm{SNR}_\Xi$, we now present an
$\mathrm{SNR}_\Xi$-based algorithm to identify glitches. We use data collected from the LIGO S5 science
run. Our network consists of the two 4 km LIGO interferometers mentioned in Section 1.
The data are time-shifted by a duration greater than the GW travel time between H1
and L1 in order to wash out the presence of astrophysical signals. To begin, we utilize
ft-maps with 4 s x 0.25 Hz pixels.
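As a rough illustration of the time-shift step, the sketch below offsets one strain series relative to the other by more than the inter-site light travel time (about 10 ms for Hanford and Livingston). The function name `time_shift`, the use of a circular shift, and the `min_shift` guard are assumptions made for this example; the text does not specify how the shift is implemented.

```python
import numpy as np

def time_shift(strain, shift_seconds, sample_rate, min_shift=0.010):
    """Circularly shift a strain time series by shift_seconds.

    min_shift is the inter-detector light travel time in seconds; shifts
    no larger than this could leave a real signal coherent between sites,
    so they are rejected. The circular wrap-around is an assumption of
    this sketch.
    """
    if abs(shift_seconds) <= min_shift:
        raise ValueError("shift must exceed the inter-detector travel time")
    n_samples = int(round(shift_seconds * sample_rate))
    return np.roll(np.asarray(strain), n_samples)
```

Any glitch present in the shifted series is preserved sample for sample; only its alignment with the other detector's data changes.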

In Fig. 2 we show the distribution of $\mathrm{SNR}_\Xi$ for well-behaved noise (top), glitchy
noise (middle) and a simulated accretion disk instability (ADI) GW signal (see Section 5)
injected on top of Gaussian noise (bottom). "Well-behaved" means that there are no
obvious high-level glitches visible in an ft-map of SNR, which is to say that the data
approximate stationary Gaussian noise. As examples of glitchy noise, we utilize data
from two extreme glitches: one from H1 and one from L1. As stated in Section 3, we
observe that glitches cause an excess of pixels near $|\mathrm{SNR}_\Xi| = 1$. However, if we simply
flag segments with $|\mathrm{SNR}_\Xi| \approx 1$, we will throw out more data than necessary because
segments neighboring a glitch also exhibit $|\mathrm{SNR}_\Xi| \approx 1$.

To discriminate between the glitch segment and its neighbors, we define an
additional metric, the auto-power stationarity ratio:

$$R_I(t) = \frac{1}{N_f} \sum_f \frac{P_I(t;f)}{P'_I(t;f)} \qquad (11)$$

Here $N_f$ is the number of frequency bins. We expect segments with a glitch to have
$R_I(t) \gg 1$ whereas neighboring segments should have $R_I(t) < 1$. (Of course, GW
signals can also lead to $R_I(t) > 1$ so it is necessary to use $R$ in conjunction with $\mathrm{SNR}_\Xi$
in order to separate glitches from GW events.) Glitches are unlikely to occur in two
interferometers at the same time.
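A short sketch makes the expected behavior of the stationarity ratio concrete. It assumes the ratio is the average over frequency bins of the auto-power divided by its neighbor average, which is one reading of the description above; the function name `stationarity_ratio` and the array layout are hypothetical.

```python
import numpy as np

def stationarity_ratio(power, power_neighbors):
    """Frequency-averaged ratio of auto-power to neighbor-averaged auto-power.

    Both inputs have shape (n_freq, n_time). Returns one value per segment.
    The mean-of-ratios form is an assumption of this sketch.
    """
    power = np.asarray(power, dtype=float)
    power_neighbors = np.asarray(power_neighbors, dtype=float)
    return np.mean(power / power_neighbors, axis=0)

# Toy map: unit noise power with one loud segment at column 4.
power = np.ones((4, 9))
power[:, 4] = 50.0
# Neighbor averages (n = 4): the loud column inflates its neighbors only.
neighbors = np.full((4, 9), (7 * 1.0 + 50.0) / 8)
neighbors[:, 4] = 1.0
ratio = stationarity_ratio(power, neighbors)
```

In this toy case the loud segment has a ratio of 50 while each of its neighbors has a ratio of roughly 0.14, so a threshold on the ratio separates the glitch segment from the segments it contaminates.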

Now we are ready to devise our glitch-likely flag. A data segment (or equivalently, an
ft-map column) is identified as glitch-like if either of the following criteria is satisfied:

> 2.7% of pixels have $0.95 < \mathrm{SNR}_\Xi < 1.05$ and $R_I(t) > 2$ and $R_J(t) < 2$, (12a)

> 2.7% of pixels have $-1.05 < \mathrm{SNR}_\Xi < -0.95$ and $R_J(t) > 2$ and $R_I(t) < 2$. (12b)

These parameters are chosen primarily to optimize the efficiency of our algorithm at
rejecting glitches, though some fine tuning is necessary to ensure the safety of a particular
signal model. In this case, the parameters are adjusted for the ADI model (see Fig. 8),
but we shall see that they are also effective for a very different signal model (based
on accretion disk fragmentation) in Section 6. Before we continue, it will be useful
to define $\mathcal{F}$ as the ratio of the number of pixels at some time $t$ satisfying the criterion
$0.95 < |\mathrm{SNR}_\Xi| < 1.05$ to the total number of pixels at time $t$. Note that $\mathcal{F}$, by definition,
must take on discrete values.

In order for this to be an effective flag, it must not only identify glitch-like structures
in the data, but it should also have a low false glitch rate. We define the false glitch rate
Figure 2: Histograms of $\mathrm{SNR}_\Xi$ ft-map pixels. Top: 2200 s of Gaussian Monte Carlo
noise (left) and well-behaved (nearly-Gaussian) time-shifted LIGO S5 data (right).
Middle row: two examples of glitches in H1 (left) and L1 (right) each consisting of
2 s of data. These data segments were chosen to illustrate examples of strong glitches.
Bottom: 40 s-long circularly-polarized accretion disk instability injections recovered with
an unpolarized filter at 30 Mpc (left) and at 5 Mpc (right). The red bars indicate
$0.95 < |\mathrm{SNR}_\Xi| < 1.05$. The x-axis range differs between rows due to the different
amount of data being analyzed in each case.
as the fraction of Gaussian-noise segments (containing no glitches) flagged as glitch-like
per unit time. Using simulated Gaussian data, we estimate a false glitch rate of
$< 1 \times 10^{-3}\,\mathrm{day}^{-1}$. This false glitch rate is calculated for a frequency range between
100-250 Hz consisting of 150 pixels, a range suitable for the ADI model that we will
use to test this algorithm (see Section 5).
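The two glitch criteria, each combining a pixel-fraction condition on $\mathrm{SNR}_\Xi$ with thresholds on the stationarity ratios, can be written compactly as array operations. The sketch below is an illustration only: the function name `glitch_flag`, its argument names, and the `(n_freq, n_time)` layout are assumptions, while the default thresholds (2.7%, the 0.95 to 1.05 window, and a ratio threshold of 2) are the values quoted in the text.

```python
import numpy as np

def glitch_flag(snr_xi_map, r_i, r_j, frac=0.027, lo=0.95, hi=1.05, r_thresh=2.0):
    """Return one boolean per segment: True where the segment is glitch-like.

    snr_xi_map: auto-power difference SNR, shape (n_freq, n_time).
    r_i, r_j:   stationarity ratios for detectors I and J, shape (n_time,).
    """
    snr_xi_map = np.asarray(snr_xi_map, dtype=float)
    r_i = np.asarray(r_i, dtype=float)
    r_j = np.asarray(r_j, dtype=float)
    # Fraction of pixels in each column near +1 and near -1.
    frac_pos = np.mean((snr_xi_map > lo) & (snr_xi_map < hi), axis=0)
    frac_neg = np.mean((snr_xi_map > -hi) & (snr_xi_map < -lo), axis=0)
    # Excess power in detector I only, or in detector J only.
    crit_a = (frac_pos > frac) & (r_i > r_thresh) & (r_j < r_thresh)
    crit_b = (frac_neg > frac) & (r_j > r_thresh) & (r_i < r_thresh)
    return crit_a | crit_b
```

Requiring the other detector's ratio to stay below threshold is what protects loud signals: excess power that appears in both detectors at once fails both criteria.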

To determine the effectiveness of our flag, we perform a background study
comparing time-shifted data (containing stationary noise and glitches) with Monte
Carlo (stationary noise) with and without flagged data removed. An effective flag
eliminates high-SNR events from the tail of the distribution, thereby creating better
agreement between time-shifted and Monte Carlo data. We utilize a density-based
search algorithm [56] to analyze 12 s x 150 Hz ft-maps with 4 s x 0.25 Hz pixels||. We
focus on a frequency range of 100-250 Hz in order to study the ADI signal discussed in
Section 5 (see Fig. 1 bottom row). In Fig. 3 we plot p-value (false alarm probability) vs.
SNR for Monte Carlo and time-shifted data with and without the glitch-likely flag. The
results indicate a significant improvement in the agreement between time-shifted and
Monte Carlo data with the application of the flag. The required SNR for a p = 0.1%
event is reduced more than two-fold through the use of the glitch identification flag.
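A p-value curve of this kind can be estimated empirically from a background sample. The sketch below assumes the background consists of the loudest cluster SNR from each analyzed ft-map, with maps that produce no above-threshold cluster contributing nothing to the list; the function names and this bookkeeping are assumptions for illustration, not a description of the search code.

```python
import numpy as np

def false_alarm_probability(loudest_snr, n_maps, snr):
    """Fraction of background ft-maps with a cluster at or above snr.

    loudest_snr: loudest cluster SNR from each map that produced a cluster.
    n_maps:      total number of background maps analyzed, including maps
                 with no cluster (so the curve can plateau below 1).
    """
    loudest_snr = np.asarray(loudest_snr, dtype=float)
    return np.count_nonzero(loudest_snr >= snr) / n_maps

def snr_threshold(loudest_snr, n_maps, p_target=1e-3):
    """Smallest observed background SNR whose p-value is <= p_target.

    Returns None when the background sample is too small to reach p_target.
    """
    values = np.sort(np.asarray(loudest_snr, dtype=float))
    for value in values:
        if false_alarm_probability(values, n_maps, value) <= p_target:
            return float(value)
    return None
```

Comparing `snr_threshold` for the same `p_target` before and after removing flagged segments gives the kind of sensitivity ratio quoted above.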
Figure 3: Plot of p-value vs. SNR for a density-based search algorithm [56] applied
to time-shifted LIGO S5 data (TS) and Gaussian Monte Carlo data (MC), with and
without our $\mathrm{SNR}_\Xi$-based glitch cut applied, for 4 s x 0.25 Hz pixels in a frequency band of
100-250 Hz. The $\mathrm{SNR}_\Xi$-based glitch cut improves the sensitivity at p = 0.1% (marked
with a black dotted line) by more than a factor of two. The asymptotic p-value at low
SNR is the probability that any above-threshold cluster is identified.



Having demonstrated the efficacy of our glitch identification algorithm for the case
of 4 s x 0.25 Hz pixels in the 100-250 Hz band chosen to study the ADI model (see
Section 5), we now consider a few other cases. An exhaustive exploration of the domain

|| The algorithm is a modified version of BurstCluster by Peter Kalmus and Rubab Khan created for 
the LIGO Flare Pipeline (see 00)- 
of utility for this algorithm is beyond our present scope. Rather, we aim to highlight
both the promise and the limitations of this technique by considering a few more special
cases. In the top-left panel of Fig. 4 we plot p-value vs. SNR for 1 s x 1 Hz pixels in
the same 100-250 Hz frequency band used in Fig. 3. Based on the agreement between
time-shifted and Monte Carlo data, we conclude that even relatively short segments of
data can be effectively cleaned with the glitch identification algorithm so as to achieve
good agreement between Monte Carlo and time-shifted data.

In the top-right plot, we show the case of 4 s x 0.25 Hz pixels in a higher frequency
band: 375-525 Hz. Again we observe good agreement between time-shifted and
Monte Carlo data, though this is not surprising since this higher frequency band is
typically dominated by nearly stationary noise. On the bottom-left, we show the case
of 4 s x 0.25 Hz pixels in a lower frequency band: 40-100 Hz. While the agreement
between time-shifted and Monte Carlo data is improved with the glitch identification
algorithm, significant disagreement remains due to non-stationary noise, which is more
common at lower frequencies. An $\mathrm{SNR}_\Xi$ ft-map from a period of noisy low-frequency
data is included in the bottom-right panel, which indicates that this effect may be
due to quasi-continuous broadband noise rather than infrequent glitches. It is possible
that the inclusion of additional vetoes utilizing physical environmental monitors such as
microphones and seismometers may help achieve better agreement between time-shifted
data in this band and Monte Carlo noise.

Finally, in Fig. 5 we demonstrate how the glitch identification algorithm can
be used to improve the accuracy of a long reconstructed signal by removing one or
more glitchy segments. Motivated by models of long GW transients, which may
last for hundreds of seconds or longer (e.g., [59, 60]), we consider a 700 s-long ADI
waveform. We inject the waveform into time-shifted Gaussian noise during a period
with a known glitch (visible as a vertical column around $t \approx 490$ s). Using the
density-based clustering algorithm described above, we recover the track without
(bottom-left) and with (bottom-right) the glitch identification algorithm applied. The glitch
identification algorithm correctly identifies the glitch, which is therefore excluded from
the reconstructed event. This demonstrates not only that the glitch identification
algorithm improves the accuracy of a reconstructed track, but also that it is in theory
possible to observe a GW event disrupted by a glitch. While this possibility is discussed
in [33], this is (to our knowledge) the first time that a method has been proposed for
removing pieces of glitchy data from the middle of a GW trigger using only strain data.



5. Toy model waveforms 



In order to demonstrate our glitch identification algorithm, we utilize a toy model [61] for
a narrowband signal from an accretion disk instability, which can take place during the
collapsar death of a star and may therefore be associated with long gamma-ray bursts
(see also [15, 16]). In this "suspended accretion" model, a spinning black hole (with
mass $M$ and parameterized by a dimensionless spin parameter $a^*$) is surrounded by a
Figure 4: Top-left: plot of p-value vs. SNR for the 100-250 Hz band using 1 s x 1 Hz
pixels (the asymptotic p-value at low SNR differs from the 4 s x 0.25 Hz case since we
tune the clustering algorithm differently for different segment durations). The relatively
good agreement between Monte Carlo (MC) and time-shifted data (TS) suggests that
even short O(1 s) segments of cross-correlated data can be effectively cleaned with our
glitch identification flag. Top-right: plot of p-value vs. SNR for the 375-525 Hz band
using 4 s x 0.25 Hz pixels. This higher frequency band exhibits good agreement between
Monte Carlo and time-shifted data due to the nearly stationary noise associated with
higher frequencies. Bottom-left: plot of p-value vs. SNR for the 40-100 Hz band
using 4 s x 0.25 Hz pixels. While the cut dramatically improves the agreement between
Monte Carlo and time-shifted data, significant disagreement remains, possibly due to
non-stationary noise associated with this band. Bottom-right: an ft-map of $\mathrm{SNR}_\Xi$ for
time-shifted LIGO S5 data demonstrating the non-stationary noise sometimes associated
with low frequencies. The 60 Hz line is masked.
Figure 5: Ft-maps of an ADI software injection in time-shifted data containing a glitch
(near $t \approx 490$ s). Top-left is SNR and top-right is $\mathrm{SNR}_\Xi$. The bottom plots show the
recovered track without (left) and with (right) the glitch identification algorithm. The
glitch identification algorithm excludes the glitch (visible as a vertical column of bright
pixels) from the reconstructed track.



torus (mass $m$). The spinning black hole drives magneto-hydrodynamical turbulence in
the torus, which causes it to form clumps with mass given by $\epsilon m$. These clumps emit
elliptically polarized narrowband gravitational radiation for a duration of O(10-100 s)
as the central black hole transfers its angular momentum to the clumps. This toy model
provides a useful test of our algorithm because we expect many sources of long GW
transients to be both narrowband and elliptically polarized [1]. We use $M = 10\,M_\odot$,
$m = 1.5\,M_\odot$, $\epsilon = 0.1$ and $a^* = 0.95$ to create the $\approx 40$ s waveform used here. A
spectrogram of this waveform can be seen in the bottom-left panel of Fig. 1. For more
details see [61].

In addition to the ADI model, we also consider an accretion disk fragmentation
model from [61]. In this model, an accretion disk associated with a long gamma-ray
burst forms clumps through helium photo-disintegration [14]. These clumps inspiral
into the remnant black hole, creating a chirp-like GW signal.

The fragmentation model can be tuned to produce shorter burst-like signals.
Burst-like signals present an extra challenge to the glitch identification algorithm because, like
a glitch, the power is typically concentrated in a single O(1 s)-wide ft-map column
(though we still expect the auto-power between two interferometers to be consistent for
a well-constructed filter). While we are primarily concerned here with long transients, we
use a short $\approx 1$ s fragmentation waveform in Section 6 in order to study the performance
of the glitch identification flag in this limiting case. We shall see that, while the flag
performs best for long transients, the false dismissal rate is low even for short signals
unless the signal is unrealistically loud. The fragmentation waveforms from [61] are
parameterized by the mass of the central black hole $M$, the torus scale height $\eta$, the torus
viscosity $\alpha$ and the initial radius $r_0$ (in units of black hole mass). We use $M = 10\,M_\odot$,
$\eta = 0.8$, $\alpha = 0.1$ and $r_0 = 200$ to create the $\approx 1$ s waveform here. Ft-maps of this
fragmentation waveform are shown in Fig. 6.




Figure 6: Ft-maps of SNR (left) and $\mathrm{SNR}_\Xi$ (right) for a $\sim 1$ s accretion disk
fragmentation waveform injected into stationary noise (d = 1 Mpc).



6. Safety 

A critical aspect of any glitch identification algorithm is its safeness: the probability
that it falsely identifies a segment associated with a GW signal as glitch-like. To test
the safeness of our glitch flag, we apply it to ADI injections in Gaussian simulated
noise at different sky locations. Many long transient signals (including the ADI model
considered here) are expected to be elliptically polarized [1]. In practice, however, it
is possible to search for such signals with an unpolarized filter since the two-detector
statistic $Y(t;f)$ is largely unaffected by polarization details, if the signal is not so long
that the polarization degeneracy is resolved by the rotation of the Earth. In this analysis
we use circularly polarized waveforms, a plausible model for many elliptically polarized
sources with electromagnetic triggers, which tend to be observed head-on [62].

In Fig. 7 we present the results of a safety study in which we perform Monte
Carlo injections of ADI signals on top of Gaussian simulated noise. We use 312
uniformly distributed sky directions with 20 noise realizations for each direction. To
be flagged as glitch-like, a segment must satisfy our requirements on $R(t)$ and $\mathrm{SNR}_\Xi$
(see Eqs. 12a and 12b). For each injection we record the fraction of segments satisfying
the requirements on $R(t)$ alone (blue), $\mathrm{SNR}_\Xi$ alone (red) and segments meeting both
criteria and therefore being identified as glitch-like (black). Note that our ADI signal
spans $\sim 39$ data segments. For marginally detectable signals (a d = 38 Mpc signal can
be recovered with p = 0.1%), the fraction of flagged segments is negligible. For a very
loud signal at d = 5 Mpc (see the lower-left-hand plot in Fig. 1), the fraction of flagged
segments becomes 3%. We conclude that for realistic (marginally detectable) signals,
the proposed glitch identification flag leads to an acceptably small false dismissal rate.
In order to further reduce the false dismissal rate for very high-SNR signals, one could
design a less aggressive auto-power cut for triggers with extremely high $\mathrm{SNR}_\Xi$, but this
is beyond our present scope.
Figure 7: Safety study for simulated accretion disk instability signals. The x-axis is
the distance to the source. For each distance, we average over 312 directions and 20
noise realizations. The y-axis is the fraction of segments satisfying the $R$ criteria (solid
blue), the $\mathrm{SNR}_\Xi$ criteria (dashed red) and satisfying both in such a way as to be flagged
as glitch-like (dotted black). Note that the fraction of segments flagged as glitch-like
decreases at closer distances (corresponding to louder signals) because both detectors
exceed the threshold on $R(t)$, which prevents the signal from being flagged as glitch-like
(see Eqs. 12a and 12b).



We also consider the case of the short $t \approx 1$ s accretion disk fragmentation signal
described in Section 5. In order to test the glitch rejection algorithm on this short
signal, we inject the waveform on top of Monte Carlo noise. We vary the distance of
the injection and perform many trials at each distance, averaging over sky location. For
a very loud d = 1 Mpc signal, the false dismissal probability is high: 21%. However,
we find that the false dismissal probability is < 1% for signals at d > 2.4 Mpc. While our
clustering algorithm is not designed for signals that are vertical ft-map columns, we
can estimate our sensitivity to short signals by summing all the pixels in the brightest
column in order to calculate a total SNR for that segment [1]. For signals at d = 2.4 Mpc,
the total SNR $\approx 12$ on average. For a quasi-normally distributed quantity like total
SNR, this corresponds to an extremely small p-value. We conclude that even for very
short signals, the false dismissal probability is small for signals with realistic values of
SNR, though unrealistically high values of SNR have a significant probability of being
flagged as glitch-like.
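To put a number on "extremely small", the one-sided tail probability of a unit normal variable can be evaluated directly. This is a back-of-the-envelope sketch that treats the total SNR as exactly Gaussian, which the text only claims approximately; the function name is hypothetical.

```python
import math

def gaussian_tail_probability(snr):
    """One-sided probability that a unit normal variable exceeds snr."""
    return 0.5 * math.erfc(snr / math.sqrt(2.0))

# Tail probability for a total SNR of 12 under the Gaussian assumption.
p_value = gaussian_tail_probability(12.0)
```

Under this idealization the result is smaller than $10^{-30}$, many orders of magnitude below the p = 0.1% level used elsewhere in the analysis.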

Having discussed both the efficacy of the algorithm flagging glitches as well as its 
safeness not flagging segments associated with GW signals, it is interesting to consider 
the parameter space of the cut. In Fig. [HI we show scatter plots of injected ADI signals 
(left) and noise (right) in the plane of our glitch identification parameters R(t) and 
J 7 . The horizontal and vertical lines indicate the glitch-likely thresholds. Data markers 
in the upper right-hand quadrant satisfying both cuts are flagged as glitch-like. The 
left-hand plot includes eight different ADI injection distances ranging from d = 5 Mpc 
to 40 Mpc; redder data markers correspond to smaller distances. We consider injections 
from 50 random directions at each distance, each of which is associated with 20 time 
segments, giving a total of 20 × 50 × 8 = 8000 data markers. The right-hand plot includes 
8000 data markers for S5 LIGO time-shifted data (red x's) and 8000 data markers for 
Monte Carlo Gaussian noise (green o's). Our cut is chosen to exclude the "glitch tail" 
of the red time shift distribution extending up and to the right while preserving most 
of the injected signals. Different signal models and different noise environments may 
require different cuts than the ones presented here. 
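
The logic of the two-parameter cut can be sketched as follows. The function and variable names, the strict inequalities and the threshold values are placeholders chosen for illustration; they are not the values or conventions used in the analysis.

```python
def is_glitch_like(r_value, second_value, r_threshold, second_threshold):
    """Flag a segment only if BOTH parameters exceed their thresholds."""
    return r_value > r_threshold and second_value > second_threshold

def fraction_flagged(points, r_threshold, second_threshold):
    """Fraction of (r_value, second_value) points in the upper right quadrant."""
    flagged = sum(
        1 for r, s in points if is_glitch_like(r, s, r_threshold, second_threshold)
    )
    return flagged / len(points)

# Toy points; the thresholds below are placeholders, not the paper's values.
toy_points = [(1.0, 0.1), (3.0, 0.1), (1.0, 0.9), (3.0, 0.9)]
toy_fraction = fraction_flagged(toy_points, r_threshold=2.0, second_threshold=0.5)
```

Applying the same function to injected signals and to time-shifted data gives the false dismissal fraction and the fraction of noise segments removed for any trial pair of thresholds.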



7. Comparison with other data-quality flags 



As noted above, numerous methods have been devised in order to determine when the 
strain channel is contaminated or corrupted by environmental or subsystem noise (see, 
e.g., [30, 31, 32, 33, 34, 35, 36, 37]). A natural question, therefore, is: to what extent 
does the glitch identification flag developed here provide information complementary to 
existing data-quality flags? During the S5 science run, LIGO data quality flags were 
classified in terms of numbered categories 1-4 [30, 37]. These four categories describe 
different levels of severity: Category 1 marks data that will not be analyzed because 
it is corrupted or contaminated by known and identified processes; Category 2 marks 
data that is analyzed, but to which various vetoes [30, 31, 32, 33, 34, 35, 36, 37] are applied 
only in post-processing; Category 3 consists of advisory flags used for detection confidence; 
and Category 4 consists of advisory flags used to exert caution in case of a detection 
candidate. The S5 data quality flags are described comprehensively elsewhere [30, 37]. 
Figure 8: Scatter plots of injected ADI signals (left) and time-shifted data (right red 
x's) and Monte Carlo noise (right green o's) in the plane of our glitch identification 
parameters R(t) and J- '. Injection distances range from 5—40 Mpc with smaller distances 
corresponding to redder data markers. The glitch identification thresholds for each 
parameter are represented by black lines and points in the upper right quadrant are 
flagged as glitch-like. Note that J 7 takes on discrete values as discussed below Eq. I12al 



The numbering is meant to convey the usability of the data, with Category 1 
flags representing the most contaminated data. In Fig. 9 we plot p-value vs. SNR 
for time-shifted data with no flag applied (solid blue), with the $\mathrm{SNR}_\Xi$-based flag applied 
(dashed red), and with various data quality flag categories applied in succession (no 
flags, Category 1 applied, then Categories 1 and 2 applied, etc.). We find that the 
$\mathrm{SNR}_\Xi$-based flag removes a significant number of glitches that are not already identified by 
category-numbered flags. It is evident that the two types of flags are complementary: 
our glitch identification flag finds inconsistencies in auto-power between detectors, 
while the category-numbered flags identify and characterize specific instrumental and 
environmental fluctuations. 
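
A p-value vs. SNR curve of this kind can be built from time-shifted data as an empirical survival function. The sketch below is illustrative only: the function names and toy numbers are assumptions, and the p-value is taken to be the fraction of background samples at or above a given SNR.

```python
def empirical_p_value(background_snrs, snr):
    """Fraction of background samples with SNR greater than or equal to snr."""
    n_louder = sum(1 for s in background_snrs if s >= snr)
    return n_louder / len(background_snrs)

def apply_flag(snrs, flagged):
    """Keep only the samples whose segment was not flagged."""
    return [s for s, bad in zip(snrs, flagged) if not bad]

# Toy background (hypothetical): the two loudest samples come from glitches.
background = [1.0, 2.0, 3.0, 9.0, 15.0]
glitch_flag = [False, False, False, True, True]

p_before = empirical_p_value(background, 8.0)
p_after = empirical_p_value(apply_flag(background, glitch_flag), 8.0)
```

In this toy case the flag removes the high-SNR tail entirely, so the p-value at SNR = 8 drops from 0.4 to 0. Comparing such curves for different flags shows which glitches each flag removes.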

8. Conclusions 

There is strong motivation for searches for long unmodeled GW transients, but searches 
utilizing an excess cross-power statistic [1] must contend with glitches, which hamper 
sensitivity. We introduce an auto-power consistency algorithm for identifying glitch-like 
data segments in searches for long GW transients and we study its behavior in various 
regimes: well-behaved noise, glitchy noise and potentially detectable GW signals. We 
find that it is effective at identifying glitches with minimal losses in data and live-time, 
thereby improving sensitivity. At the same time, it is safe in the sense that it does not 
flag GW signals at a high rate. Finally, we note that the glitch identification algorithm 
presented here may be useful for searches for short-duration transients. This is an area 
of ongoing research. 
Figure 9: A plot of p-value vs. SNR for 1 s × 1 Hz resolution time-shifted data (TS) with 
no flags applied (solid blue), with the $\mathrm{SNR}_\Xi$-based flag applied (dashed red), and with 
the data quality flags [30, 37] applied in succession (CAT0 representing no flags applied, 
CAT1 representing the application of the Category 1 flags, CAT2 representing Category 
1 and 2 flags, etc.). The data are parsed into 12 s × 150 Hz ft-maps. 
Appendix A. Additional formalism 

We consider the general form of a metric perturbation from a point source in the 
transverse-traceless gauge, $h_{ab}(t, \vec{x})$. It can be written in terms of GW field Fourier 
coefficients, $\tilde{h}_A(f)$: 

$$ h_{ab}(t, \vec{x}) = \sum_A \int_{-\infty}^{\infty} df \, e^A_{ab}(\hat{\Omega}) \, \tilde{h}_A(f) \, e^{2\pi i f (t + \hat{\Omega} \cdot \vec{x}/c)}. \tag{A.1} $$

Here $t$ is time, $\vec{x}$ is the position vector, $\{e^A_{ab}\}$ are the GW polarization tensors, $\hat{\Omega}$ is 
the direction to the source and $A$ runs over the $+$ and $\times$ polarizations. The dependence 
of $h_{ab}(t, \vec{x})$ on $\hat{\Omega}$ is implicit. The GW strain power between times $t$ and $t + \delta t$ in some 
frequency band between $f$ and $f + \delta f$ is 

$$ H_{AA'}(t; f) = \frac{2}{\mathcal{N}} \langle \tilde{h}_A^*(t; f) \, \tilde{h}_{A'}(t; f) \rangle. \tag{A.2} $$

The factor of two comes from the fact that we consider the single-sided power spectrum 
and $\mathcal{N}$ is a normalization factor arising from the use of a discrete Fourier transform. 
The semicolon emphasizes that $t$ refers to the beginning of a data segment of length $\delta t$ 
and not to the many sampling times associated with each segment. Following [1], we 
define $H(t; f)$ so as to be invariant under change of polarization basis: 

$$ H(t; f) = \mathrm{Tr}[H_{AA'}(t; f)]. \tag{A.3} $$

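The metric perturbation in Eq. (A.1) induces a strain in detector $I$ given by 

$$ \tilde{h}_I(t; f) = \sum_A \tilde{h}_A(t; f, \hat{\Omega}) \, e^{2\pi i f (t + \hat{\Omega} \cdot \vec{x}_I/c)} \, F_I^A(t; \hat{\Omega}). \tag{A.4} $$

Here $F_I^A(t; \hat{\Omega})$ is the antenna factor for detector $I$ (see [63]) and $\vec{x}_I$ is its position vector. 
The measured strain in detector $I$ is given by the sum of $\tilde{h}_I(t; f)$ with a noise term 
$\tilde{n}_I(t; f)$: 

$$ \tilde{s}_I(t; f) = \tilde{h}_I(t; f) + \tilde{n}_I(t; f). \tag{A.5} $$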

We assume that the noise in two interferometers is uncorrelated, which is easily achieved 
for spatially separated interferometers. 

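In [1] it was shown that one can construct an unbiased estimator for $H(t; f)$ using 
the cross-power $C_{IJ}(t; f)$ created from two spatially separated interferometers $I$ and $J$. 
This estimator is given by 

$$ \hat{Y}(t; f) = \mathrm{Re}\left[ \tilde{Q}_{IJ}(t; f, \hat{\Omega}, a) \, C_{IJ}(t; f) \right] = \frac{2}{\mathcal{N}} \mathrm{Re}\left[ \tilde{Q}_{IJ}(t; f, \hat{\Omega}, a) \, \tilde{s}_I^*(t; f) \, \tilde{s}_J(t; f) \right]. \tag{A.6} $$

Here $\tilde{Q}_{IJ}(t; f, \hat{\Omega}, a)$ is a filter function which takes into account the phase delay from 
the spatial separation of the interferometers as well as the detection efficiency of 
interferometers $I$ and $J$. It also depends on $a$, which is a set of parameters that 
characterizes the expected form of $H_{AA'}(t; f)$, such as the polarization of the source. We 
can write the filter function as: 

$$ \tilde{Q}_{IJ}(t; f, \hat{\Omega}, a) = \frac{1}{\epsilon_{IJ}(t; \hat{\Omega}, a)} \, e^{2\pi i f \hat{\Omega} \cdot \Delta\vec{x}_{IJ}/c}. \tag{A.7} $$

Here $\Delta\vec{x}_{IJ} \equiv \vec{x}_I - \vec{x}_J$ is the difference in position vectors for detectors $I$ and $J$. 
$\epsilon_{IJ}(t; \hat{\Omega}, a)$ is the "pair efficiency," which is defined in terms of the expectation value of 
interferometer cross- and auto-powers: 

$$ \langle C_{IJ}(t; f) \rangle = \epsilon_{IJ}(t; \hat{\Omega}, a) \, H(t; f) \, e^{-2\pi i f \hat{\Omega} \cdot \Delta\vec{x}_{IJ}/c} \tag{A.8} $$

$$ \langle P_I(t; f) \rangle = \epsilon_{II}(t; \hat{\Omega}, a) \, H(t; f) + N_I(t; f) \tag{A.9} $$

where $N_I(t; f) = (2/\mathcal{N}) |\tilde{n}_I(t; f)|^2$ and $H(t; f)$ is defined in Eq. (A.3). The pair efficiency 
for an unpolarized source is: 

$$ \epsilon_{IJ}(t; \hat{\Omega}, \mathrm{unpolarized}) = \frac{1}{2} \sum_A F_I^A(t; \hat{\Omega}) \, F_J^A(t; \hat{\Omega}). \tag{A.10} $$

Hereafter we abbreviate $\epsilon_{IJ}(t; \hat{\Omega}, a)$ as simply $\epsilon_{IJ}$. 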

Through our definition of $\tilde{Q}_{IJ}(t; f, \hat{\Omega}, a)$, we implicitly assume that the direction of 
the source $\hat{\Omega}$ is known. In order to estimate how well we must know $\hat{\Omega}$, we consider how 
large the error in $\hat{\Omega}$ (denoted $\delta\theta$) must be before we lose too much signal power. If we 
demand that we measure a fraction of at least $R$ of the total possible power, then we 
can tolerate angular errors $\delta\theta$ of 

$$ \delta\theta \lesssim \frac{c}{2\pi f |\Delta\vec{x}_{IJ}|} \cos^{-1}(R). \tag{A.11} $$

For the Hanford-Livingston pair, this implies that we can tolerate angular errors of 
$\delta\theta < 0.8°$ up to 500 Hz with $R = 90\%$. For comparison, we note that the Swift 
experiment has an angular resolution of ~0.25° [64]. For the remainder of the paper, 
we consider a single search direction. For triggers with large error regions on the sky, 
one can iterate over a grid of points inside a search cone, but this is a trivial extra step. 
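
As a rough numerical cross-check of the quoted tolerance, the sketch below assumes that the recovered power fraction falls off as the cosine of the phase error $2\pi f |\Delta\vec{x}_{IJ}| \delta\theta / c$ (worst-case geometry) and that the Hanford-Livingston separation is about 3000 km. Both are assumptions made for illustration, and the function name is hypothetical.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def max_angular_error_deg(freq_hz, baseline_m, power_fraction):
    """Largest pointing error (degrees) that keeps cos(phase error) >= power_fraction.

    Assumption: the phase error is 2*pi*f*|dx|*dtheta/c, i.e. the source
    direction is perpendicular to the baseline (worst case).
    """
    phase_budget = math.acos(power_fraction)
    delta_theta = C_LIGHT * phase_budget / (2.0 * math.pi * freq_hz * baseline_m)
    return math.degrees(delta_theta)

# Approximate Hanford-Livingston baseline (assumed 3.0e6 m), f = 500 Hz, R = 0.9.
tolerance = max_angular_error_deg(500.0, 3.0e6, 0.90)
```

With these inputs the tolerance comes out at roughly 0.8°, consistent with the value quoted above. Because the tolerance scales as 1/f, it is looser at lower frequencies.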
Since by assumption the noise in detectors $I$ and $J$ is uncorrelated, it follows that 

$$ \langle \hat{Y}(t; f) \rangle = H(t; f). \tag{A.12} $$
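
The unbiasedness of the estimator can be checked with a small Monte Carlo toy for a single frequency bin. This is a sketch under stated assumptions, not the analysis code: the normalization 2/N is set to 1, the source is unpolarized with each polarization carrying half of the total power H, detector J is placed at the origin so that only detector I carries the delay phase, and the antenna factors, delay phase and noise power are arbitrary illustrative numbers.

```python
import cmath
import math
import random

def complex_gauss(rng, mean_square):
    """Draw a complex Gaussian variable z with <|z|^2> = mean_square."""
    sigma = math.sqrt(mean_square / 2.0)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

def mean_estimator(h_power, f_i, f_j, delay_phase, noise_power, n_trials, seed=0):
    """Average of Re[Q_IJ * conj(s_I) * s_J] over many noise and signal draws."""
    rng = random.Random(seed)
    # Pair efficiency for an unpolarized source: (1/2) * sum_A F_I^A F_J^A.
    eps_ij = 0.5 * sum(a * b for a, b in zip(f_i, f_j))
    # Filter function; delay_phase stands for 2*pi*f*Omega.dx_IJ/c.
    q_ij = cmath.exp(1j * delay_phase) / eps_ij
    total = 0.0
    for _ in range(n_trials):
        # Each polarization carries half of the total power H (unpolarized).
        h = [complex_gauss(rng, h_power / 2.0) for _ in range(2)]
        s_i = sum(f * ha for f, ha in zip(f_i, h)) * cmath.exp(1j * delay_phase)
        s_j = sum(f * ha for f, ha in zip(f_j, h))
        # Independent (uncorrelated) noise in the two detectors.
        s_i += complex_gauss(rng, noise_power)
        s_j += complex_gauss(rng, noise_power)
        # Toy normalization: 2/N is set to 1, so C_IJ = conj(s_I) * s_J.
        total += (q_ij * s_i.conjugate() * s_j).real
    return total / n_trials

y_mean = mean_estimator(h_power=2.0, f_i=(0.6, 0.5), f_j=(0.7, 0.4),
                        delay_phase=0.7, noise_power=1.0, n_trials=50000)
```

The average `y_mean` converges to the injected power `h_power` even though each detector's noise is comparable to the signal, because the uncorrelated noise terms average to zero in the cross-power.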

An estimator for the variance of $\hat{Y}(t; f)$ is given by [1]: 

$$ \hat{\sigma}_Y^2(t; f) = \frac{1}{2} |\tilde{Q}_{IJ}(t; f, \hat{\Omega}, a)|^2 \, P'_I(t; f) \, P'_J(t; f). \tag{A.13} $$

Here $P'_I(t; f)$ and $P'_J(t; f)$ are the auto-powers measured in detectors $I$ and $J$, 
respectively. The prime denotes that they are calculated using $2n$ neighboring segments 
in order to obtain an estimate of the noise associated with the segment beginning at $t$: 

$$ P'_I(t_0; f) = \frac{1}{2n} \sum_{\substack{t = t_0 - n\delta t \\ t \neq t_0}}^{t_0 + n\delta t} P_I(t; f). \tag{A.14} $$
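
The neighbor-based noise estimate can be sketched as below for a single frequency bin. The sketch assumes that the primed auto-power is the plain average over the n segments on either side of the segment of interest, excluding that segment itself. This reading follows from the phrase "2n neighboring segments" but is an assumption, as are the function names and toy numbers.

```python
import math

def neighbor_average(powers, k, n):
    """Average auto-power over the n segments on each side of segment k.

    Segment k itself is excluded, so a transient in segment k does not
    inflate its own noise estimate (assumed reading of "2n neighbors").
    """
    if k - n < 0 or k + n >= len(powers):
        raise ValueError("not enough neighboring segments")
    neighbors = powers[k - n:k] + powers[k + 1:k + n + 1]
    return sum(neighbors) / (2 * n)

def sigma_y(q_abs, p_i_adj, p_j_adj):
    """Square root of (1/2) |Q|^2 P'_I P'_J."""
    return math.sqrt(0.5 * q_abs ** 2 * p_i_adj * p_j_adj)

# Toy auto-powers (hypothetical); detector I has a loud transient in segment 2.
powers_i = [1.0, 1.1, 50.0, 0.9, 1.0]
powers_j = [1.2, 1.2, 1.2, 1.2, 1.2]
p_i_adj = neighbor_average(powers_i, k=2, n=2)
p_j_adj = neighbor_average(powers_j, k=2, n=2)
sigma = sigma_y(q_abs=1.0 / 0.31, p_i_adj=p_i_adj, p_j_adj=p_j_adj)
```

Because the transient in segment 2 is excluded from its own noise estimate, the primed auto-power stays near 1 while the measured auto-power in that segment is 50. A mismatch of this kind between the two is what an auto-power consistency check can exploit.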



References 



[1] Thrane E, Kandhasamy S, Ott C D et al. 2011 Phys. Rev. D 83 083004 

[2] Ott C D 2009 Class. Quantum Grav. 26 063001 

[3] Dessart L, Burrows A, Livne E and Ott C D 2006 Astrophys. J. 645 534 

[4] Müller E, Rampp M, Buras R, Janka H T and Shoemaker D H 2004 Astrophys. J. 
603 221 

[5] Keil W, Janka H T and Müller E 1996 Astrophys. J. Lett. 473 111 

[6] Miralles J A, Pons J A and Urpin V A 2000 Astrophys. J. 543 1001 

[7] Miralles J A, Pons J A and Urpin V 2004 Astron. Astrophys. 420 245-249 

[8] Corsi A and Mészáros P 2009 Astrophys. J. 702 1171 

[9] Piro A L and Ott C D 2011 Astrophys. J. 736 1171 

[10] Ott C D, Dimmelmeier H, Marek A, Janka H T, Hawke I, Zink B and Schnetter E 
2007 Phys. Rev. Lett. 98 261101 

[11] Ou S, Tohline J E and Lindblom L 2004 Astrophys. J. 617 490 

[12] Scheidegger S, Käppeli R, Whitehouse S C, Fischer T and Liebendörfer M 2010 
Astron. Astrophys. 514 A51 

[13] Lai D and Shapiro S L 1995 Astrophys. J. 442 259 



[14] Piro A L and Pfahl E 2007 Astrophys. J. 658 1173 

[15] van Putten M 2002 Astrophys. J. Lett. 575 71-74 

[16] van Putten M 2001 Phys. Rev. Lett. 87 091101 

[17] van Putten M 2008 Astrophys. J. Lett. 684 91 

[18] Andersson N, Comer G L and Langlois D 2002 Phys. Rev. D 66 104002 

[19] Glampedakis K, Samuelsson L and Andersson N 2006 Mon. Not. R. Ast. Soc. Lett. 371 74 

[20] Samuelsson L and Andersson N 2007 Mon. Not. R. Ast. Soc. 374 256 

[21] Levin Y 2006 Mon. Not. R. Ast. Soc. Lett. 368 35 

[22] Sotani H, Kokkotas K D and Stergioulas N 2008 Mon. Not. R. Ast. Soc. Lett. 385 5 

[23] Horvath J E 2005 Modern Physics Lett. A 20 2799 

[24] de Freitas Pacheco J A 1998 Astron. Astrophys. 336 397 

[25] Ioka K 2001 Mon. Not. R. Ast. Soc. 327 639 

[26] Vaishnav B, Hinder I, Shoemaker D and Herrmann F 2009 Class. Quantum Grav. 26 204008 

[27] Levin J and Contreras H 2010 Inspiral of generic black hole binaries: Spin, precession, and eccentricity http://arxiv.org/abs/1009.2533 

[28] O'Leary R M, Kocsis B and Loeb A 2009 Mon. Not. R. Ast. Soc. 395 2127 

[29] Kocsis B and Levin J 2011 in preparation (Preprint http://arxiv.org/abs/1109.4170) 

[30] Blackburn L et al. 2008 Class. Quantum Grav. 25 184004 

[31] Smith J R et al. 2011 A hierarchical method for vetoing noise transients in gravitational-wave detectors submitted to Class. Quantum Grav. 

[32] Ajith P et al. 2006 Class. Quantum Grav. 23 5825 

[33] Coughlin M for the LVC 2011 arXiv:1108.1521 

[34] Isogai T for the LIGO Scientific Collaboration and the Virgo Collaboration 2010 J. Phys. Conf. Ser. 243 012005 

[35] Ballinger T for the LIGO Scientific Collaboration and the Virgo Collaboration 2009 Class. Quantum Grav. 26 204003 

[36] Christensen N for the LIGO Scientific Collaboration and the Virgo Collaboration 2010 Class. Quantum Grav. 27 194010 

[37] Slutsky J et al. 2010 Class. Quantum Grav. 27 165023 

[38] Cannon K C 2008 Class. Quantum Grav. 24 105024 

[39] Hild S et al. 2007 Class. Quantum Grav. 24 3783 

[40] Abbott B et al. 2009 Rep. Prog. Phys. 72 076901 

[41] Abbott B et al. 2009 Nature 460 990 

[42] Abbott B et al. 2008 Astrophys. J. Lett. 683 45 

[43] Acernese F for the Virgo Collaboration 2006 Class. Quantum Grav. 23 S63 

[44] Acernese F et al. 2008 Class. Quantum Grav. 25 184001 

[45] Accadia T et al. 2011 Class. Quantum Grav. 28 114002 

[46] https://wwwcascina.virgo.infn.it/advirgo/ 

[47] Kuroda K for the LCGT Collaboration 2010 Class. Quantum Grav. 27 084004 

[48] http://gw.icrr.u-tokyo.ac.jp/lcgt/ 

[49] The LIGO Scientific Collaboration 2011 Nature Physics Online doi:10.1038/nphys2083 

[50] Willke B et al. 2006 Class. Quantum Grav. 23 S207-S214 

[51] Grote H for the LIGO Scientific Collaboration 2008 Class. Quantum Grav. 25 114043 

[52] Grote H for the LIGO Scientific Collaboration 2010 Class. Quantum Grav. 27 084003 

[53] Abbott B et al. (LIGO Scientific Collaboration) 2007 Phys. Rev. D 76 082003 (Preprint astro-ph/0703234) 

[54] Ballmer S 2006 LIGO interferometer operating at design sensitivity with application to gravitational radiometry Ph.D. thesis Massachusetts Institute of Technology 

[55] The LIGO and Virgo Collaborations 2011 Directional limits on gravitational waves using LIGO S5 science data in preparation 

[56] Khan R and Chatterji S 2009 Class. Quantum Grav. 26 155009 

[57] Abbott B et al. (LIGO Scientific Collaboration) 2009 Astrophys. J. Lett. 701 68-74 

[58] Abbott B et al. (LIGO Scientific Collaboration) 2008 Phys. Rev. Lett. 101 211102 

[59] Corsi A and Mészáros P 2009 Class. Quantum Grav. 26 204016 

[60] Stella L et al. 2005 Astrophys. J. Lett. 634 L165 

[61] Santamaría L and Ott C D 2011 LIGO DCC T1100093 https://dcc.ligo.org/cgi-bin/DocDB/ShowDocument?docid=38606 

[62] Kobayashi S and Mészáros P 2003 Astrophys. J. Lett. 585 L89 

[63] Allen B and Romano J D 1999 Phys. Rev. D 59 102001 

[64] Gehrels N et al. 2004 Astrophys. J. 611 1005 



